Tetrahedron

centered image

This environment is part of the Mujoco environments which contains general information about the environment.

Action Space Box(-1.0, 1.0, (6,), float32)
Observation Space Box(-inf, inf, (24,), float64)
import gymnasium.make("Tetrahedron-v1")

Description

This environment consist of a tetrahedron truss robot where the each edge (member) of the tetrahedron is a linear actuator and the vertices (nodes) are the passive joints. The robot is controlled by changing the length of the linear actuators either dynamically or by changing the position of the center of mass of the robot. The goal of the robot is to reach the target position.

Notes

Problem parameters:

Action Space

The action space is a Box(0,2,(6,), float32). An action is a vector of 6 elements where each element is the length of the member. The length of the member is between 0 and 2 meters.

Num Action Min Max Name Joint Unit
0 position 0 2 [Member_1] slide lenght (m)
1 position 0 2 [Member_2] slide lenght (m)
2 position 0 2 [Member_3] slide lenght (m)
3 position 0 2 [Member_4] slide lenght (m)
4 position 0 2 [Member_5] slide lenght (m)
5 position 0 2 [Member_6] slide lenght (m)

notes: might need to change the position control to force control, thus we can have force as an input and lenght as an observation.

Observation Space

Observations capture the positional values of the center of mass and the respective time derivative aka velocity of the center of mass. Additionally, the positional values of the nodes are also captured. The observation space is a Box(-inf, inf, (18,), float64). We might need to alter the observation space to have a better learning performance.

Num Observation Min Max Name Joint Unit
0 x-coord of CoM -inf inf coord_x free (m)
1 y-coord of CoM -inf inf coord_y free (m)
2 z-coord of CoM -inf inf coord_z free (m)
3 x-linvel of CoM -inf inf vel_x free (m/s)
4 y-linvel of CoM -inf inf vel_y free (m/s)
5 z-linvel of CoM -inf inf vel_z free (m/s)
6 length of [M_1] -inf inf length_1 slide (m)
7 length of [M_2] -inf inf length_2 slide (m)
8 length of [M_3] -inf inf length_3 slide (m)
9 length of [M_4] -inf inf length_4 slide (m)
10 length of [M_5] -inf inf length_5 slide (m)
11 length of [M_6] -inf inf length_6 slide (m)
12 x-coord of [N_1] -inf inf coord_x free (m)
13 y-coord of [N_1] -inf inf coord_y free (m)
14 z-coord of [N_1] -inf inf coord_z free (m)
15 x-coord of [N_2] -inf inf coord_x free (m)
16 y-coord of [N_2] -inf inf coord_y free (m)
17 z-coord of [N_2] -inf inf coord_z free (m)
18 x-coord of [N_3] -inf inf coord_x free (m)
19 y-coord of [N_3] -inf inf coord_y free (m)
20 z-coord of [N_3] -inf inf coord_z free (m)
21 x-coord of [N_4] -inf inf coord_x free (m)
22 y-coord of [N_4] -inf inf coord_y free (m)
23 z-coord of [N_4] -inf inf coord_z free (m)

notes: for the dynamic movement, linear forces rather than positions are controlled.
notes: might want to minimize the lengths at the end, or final configuration enforces that.

Rewards

In the further implementation, the reward function might be changed to a more complex one. An example is the negative of the distance between the center of mass and the target position. The reward function is defined as:

r=−(xcm−xtarget)2+(ycm−ytarget)2+(zcm−ztarget)2 r = -\sqrt{(x_{cm} - x_{target})^2 + (y_{cm} - y_{target})^2 + (z_{cm} - z_{target})^2}

another cost can be:

Episode Termination

The truss robot is said to be unhealthy if any of the following conditions are met:

The episode is terminated if the robot is unhealthy or the robot reaches the target position. trunckated at 1000 steps.